Topic-based Evaluation for Conversational Bots

Authors

  • Fenfei Guo
  • Angeliki Metallinou
  • Chandra Khatri
  • Anirudh Raju
  • Anu Venkatesh
  • Ashwin Ram
Abstract

Dialog evaluation is a challenging problem, especially for non-task-oriented dialogs where conversational success is not well-defined. We propose to evaluate dialog quality using topic-based metrics that describe the ability of a conversational bot to sustain coherent and engaging conversations on a topic, and the diversity of topics that a bot can handle. To detect conversation topics per utterance, we adopt Deep Average Networks (DAN) and train a topic classifier on a variety of question and query data categorized into multiple topics. We propose a novel extension to DAN by adding a topic-word attention table that allows the system to jointly capture topic keywords in an utterance and perform topic classification. We compare our proposed topic-based metrics with the ratings provided by users and show that our metrics both correlate with and complement human judgment. Our analysis is performed on tens of thousands of real human-bot dialogs from the Alexa Prize competition and highlights user expectations for conversational bots.
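
As a rough illustration of what topic-based metrics over per-utterance topic labels could look like, the sketch below computes two simple quantities for one dialog: the number of distinct topics (a stand-in for topic diversity) and the mean length of consecutive same-topic runs (a stand-in for how long a topic is sustained). The function names, the example labels, and these exact definitions are assumptions made for illustration; the abstract does not specify the paper's formulas, and the labels here are hand-written rather than produced by a DAN classifier.

```python
from itertools import groupby


def topic_runs(topics):
    """Collapse consecutive identical topic labels into (topic, run_length) pairs."""
    return [(topic, sum(1 for _ in group)) for topic, group in groupby(topics)]


def topic_breadth(topics):
    """Number of distinct topics in the dialog (illustrative diversity measure)."""
    return len(set(topics))


def mean_topic_depth(topics):
    """Average number of consecutive turns spent on one topic (illustrative)."""
    runs = topic_runs(topics)
    if not runs:
        return 0.0
    return sum(length for _, length in runs) / len(runs)


# Hypothetical per-utterance topic labels for a single dialog.
dialog_topics = ["Music", "Music", "Music", "Sports", "Sports", "Music"]

print(topic_runs(dialog_topics))
print(topic_breadth(dialog_topics))
print(mean_topic_depth(dialog_topics))
```

Scores of this kind could then be compared against user ratings across many dialogs, which is the sort of comparison the abstract describes.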

Similar resources

A Conversational Agent Based on a Conceptual Interpretation of a Data Driven Semantic Space

In this work we propose an interpretation of the LSA framework which leads to a data-driven “conceptual” space creation suitable for an “intuitive” conversational agent. The proposed approach allows overcoming the limitations of traditional, rule-based, chat-bots, leading to a more natural dialogue.

Emulating Human Conversations using Convolutional Neural Network-based IR

Conversational agents (“bots”) are beginning to be widely used in conversational interfaces. To design a system that is capable of emulating human-like interactions, a conversational layer that can serve as a fabric for chat-like interaction with the agent is needed. In this paper, we introduce a model that employs Information Retrieval by utilizing convolutional deep structured semantic neural...

Modelling Affordances for the Control and Evaluation of Intrinsically Motivated Robots

In psychological theory, affordances provide a way to describe an environment in terms of the opportunities it provides an organism to act. Affordance-based models have been applied to robotics in areas such as tool-use, interaction and vision, as an alternative to hybrid control architectures. This paper introduces a model of affordances for controlling and evaluating intrinsically motivated r...

Ranking Responses Oriented to Conversational Relevance in Chat-bots

For automatic chatting systems, it is a great challenge to reply to a given query by considering the conversation history rather than the query alone. This paper proposes a deep neural network to address the context-aware response ranking problem by end-to-end learning, to help select conversationally relevant candidates. By combining the multi-column convolutional layer and...

Topic Segmentation and Labeling in Asynchronous Conversations

Topic segmentation and labeling is often considered a prerequisite for higher-level conversation analysis and has been shown to be useful in many Natural Language Processing (NLP) applications. We present two new corpora of email and blog conversations annotated with topics, and evaluate annotator reliability for the segmentation and labeling tasks in these asynchronous conversations. We propos...

Journal:
  • CoRR

Volume abs/1801.03622

Publication date: 2018